A Generalized Transformer-Based Pulse Detection Algorithm

Pulse-like signals are ubiquitous in single-molecule analysis, e.g., electrical or optical pulses caused by analyte translocations in nanopores. The primary challenge in processing pulse-like signals is to capture the pulses in noisy backgrounds, but current methods are subjective because they rely on a user-defined threshold for pulse recognition. Here, we propose a generalized machine-learning-based method, named pulse detection transformer (PETR), for pulse detection. PETR determines the start and end time points of individual pulses, thereby singling out pulse segments in a time-sequential trace. It is objective and requires no user-specified threshold. It provides a generalized interface for downstream algorithms in specific application scenarios. PETR is validated using both simulated and experimental nanopore translocation data. It achieves competitive pulse-detection performance as assessed by several standard metrics. Finally, the generality of the PETR output is demonstrated using two representative algorithms for feature extraction.


Supporting Note 1: About mean average precision and our adaptation
There are cases for which the Pulse dEtection TRansformer (PETR) algorithm produces detections with high Intersection over Union (IoU) values, as shown in Figs. S1(a) and (b). Even when PETR does not produce a high IoU value, as in the cases shown in Figs. S1(c) and (d), the detection duration is well predicted and pulses can be easily identified and individualized per detection. Consequently, basing the detection performance of PETR on the mean average precision (mAP) computed with IoU is not well suited to the characteristics of the task, and other metrics could reflect the performance of the algorithm more accurately. Therefore, we used an adapted performance metric for the evaluation. The adaptation is shown in Fig. S2. The classical IoU used to compute mAP can be seen in Fig. S2(b). Basically, the IoU is obtained by computing the intersection between the prediction and the ground truth and dividing it by their union. This metric indicates how well the prediction represents the ground truth. If an IoU value of 0.5 is taken as the threshold, only predictions with IoUs above this threshold are considered true positives.

Fig. S2. Illustration of the difference between the classical IoU computation (b) and our adaptation computing the relative distance between the prediction and the ground truth (a). Stars represent the predictions produced by PETR while dots are the ground truths.
Yet, such a metric neglects many good detections produced by PETR, such as the ones shown in Figs. S1(c) and (d). Except for the pulse around 10.0 in Fig. S1(d), the IoU value for such cases is 0 and their predictions are neglected even though the pulses are easily detectable. To mitigate this, the metric is adapted as shown in Fig. S2(a). Instead of IoU, the distance between the mid-points of the prediction and the ground truth is computed and then divided by the duration of the ground truth. This new metric allows us to associate predictions with labels even when their IoU is 0. For instance, the relative distance between the prediction and the ground truth in Fig. S1(c) is greater than 100%. A threshold of 400% could be enough to match such a pulse with the prediction produced by PETR. This enables us to consider such a prediction as a true positive and allows us to analyze the surroundings of the prediction where the real pulse is located.
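The two matching criteria described in this note can be compared with a minimal Python sketch. It assumes that each prediction and each ground truth is a (start, end) pair in the same time unit. The function names, the default threshold argument, and the use of "less than or equal" for the comparison are illustrative assumptions, not the implementation used for PETR.

```python
def iou_1d(pred, truth):
    """Classical IoU of two time intervals given as (start, end)."""
    intersection = max(0.0, min(pred[1], truth[1]) - max(pred[0], truth[0]))
    union = (pred[1] - pred[0]) + (truth[1] - truth[0]) - intersection
    return intersection / union if union > 0 else 0.0


def relative_midpoint_distance(pred, truth):
    """Distance between interval mid-points, divided by the ground-truth duration."""
    pred_mid = (pred[0] + pred[1]) / 2.0
    truth_mid = (truth[0] + truth[1]) / 2.0
    return abs(pred_mid - truth_mid) / (truth[1] - truth[0])


def is_true_positive(pred, truth, max_relative_distance=4.0):
    """Assumed matching rule: accept when the relative distance is within the threshold.

    A value of 4.0 corresponds to the 400% threshold mentioned in the text.
    """
    return relative_midpoint_distance(pred, truth) <= max_relative_distance


# Hypothetical example: the prediction does not overlap the ground truth at all.
truth = (10.0, 10.5)
pred = (11.0, 11.5)
example_iou = iou_1d(pred, truth)                        # no overlap, so IoU is 0
example_distance = relative_midpoint_distance(pred, truth)
example_match = is_true_positive(pred, truth)
```

In this hypothetical case the IoU criterion rejects the prediction, whereas the relative distance (twice the ground-truth duration) stays within the 400% threshold, so the prediction is still associated with the pulse.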

Supporting Note 2: Detailed results from PETR
Finally, the fifth row shows the Coverage. The first column (Dur 1) corresponds to traces with translocations whose duration is 0.5 ms. The second column (Dur 2) corresponds to traces with translocations whose duration is 5 ms. Finally, the third column (Dur 3) corresponds to traces with translocations whose duration is 0.5 ms, but these traces are interpolated with additional points to achieve an apparent duration of 5 ms. The value above each figure shows the average and standard deviation of the three datasets. The first value (0.5) corresponds to traces with translocations whose duration is 0.5 ms. The second value (5.0) corresponds to traces with translocations whose duration is 5 ms. Finally, the third value (5.0*) corresponds to traces with translocations whose duration is 0.5 ms, but these traces are interpolated with additional points to achieve an apparent duration of 5 ms.

Using IoU on SNR = 4 dataset
Details of the performance measured by the standard mAP for different translocation durations can be seen in Fig. S13. When our model is evaluated using the standard mAP, the results still show an mAP ≥ 0.3 for spikes of 5 ms duration, which is comparable to detectors used in other application scenarios [1][2][3][4].

Supporting Note 3: Comparison with B-Net and traditional method
The traditional method to detect pulses is based on an amplitude threshold relative to the baseline [5]. If the amplitude of a fluctuation in the signal trace surpasses this threshold, it is recognized as a pulse. Here, the threshold-based algorithm is implemented in a MATLAB program to locate the translocation spikes in current traces. In the program, the function findpeaks is used with the MinPeakProminence option [6,7]. An amplitude threshold is defined by the user relative to the root-mean-square (RMS) of the background noise. This threshold is varied from 4 to 25 times the background noise RMS to demonstrate the dependence of the results on the threshold selection.
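The dependence on the threshold can be illustrated with a minimal Python sketch. It is not the MATLAB program described above: instead of findpeaks with MinPeakProminence, it uses a plain threshold crossing below an assumed baseline of zero, and the function names and the synthetic trace are hypothetical.

```python
import math


def noise_rms(samples):
    """RMS of the fluctuations around the mean of a pulse-free stretch."""
    mean = sum(samples) / len(samples)
    return math.sqrt(sum((s - mean) ** 2 for s in samples) / len(samples))


def detect_pulses(trace, rms_value, n_multiples, baseline=0.0):
    """Return (first_index, last_index) of runs deeper than n_multiples * rms_value."""
    threshold = n_multiples * rms_value
    pulses = []
    start = None
    for i, value in enumerate(trace):
        below = (baseline - value) > threshold
        if below and start is None:
            start = i                      # pulse begins
        elif not below and start is not None:
            pulses.append((start, i - 1))  # pulse ended at the previous sample
            start = None
    if start is not None:                  # pulse still open at the end of the trace
        pulses.append((start, len(trace) - 1))
    return pulses


# Synthetic trace: alternating +/-0.1 "noise" with one shallow and one deep spike.
trace = [0.1 if i % 2 == 0 else -0.1 for i in range(100)]
for i in range(20, 25):
    trace[i] -= 1.0   # shallow spike, about 10 times the noise RMS
for i in range(60, 70):
    trace[i] -= 3.0   # deep spike, about 30 times the noise RMS

rms = noise_rms(trace[:20])                             # estimated from a pulse-free stretch
low_threshold_pulses = detect_pulses(trace, rms, 4)     # finds both spikes
high_threshold_pulses = detect_pulses(trace, rms, 25)   # misses the shallow spike
```

With the threshold at 4 times the RMS both spikes are reported, whereas at 25 times the RMS only the deep one remains, so the spike count and the derived statistics change with the user's choice.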
The spike duration and frequency of λ-DNA and streptavidin translocation obtained from the traditional algorithm and PETR are compared in Fig. S14. The average values of the spike duration and frequency from the traditional algorithm depend on the selected threshold amplitude, as shown by the deviations among the dot-line curves in the respective figures. "n" after "th" (for threshold) in the legend denotes a specific threshold level, measured by the number of multiples of the peak-to-peak value of the background noise. The features extracted by the traditional algorithm are highly dependent on the selection of the amplitude threshold, indicating the subjectivity of the threshold-based algorithm.
In the traditional method, different user-assigned thresholds for distinguishing spikes from the background noise fluctuation give totally different statistical results on duration and frequency, in terms of not only the values but also the trends with increasing bias voltage. PETR does not suffer from such subjectivity. Spike detection with PETR is based on the spike features, relative to the properties of the background noise, that the neural network acquires during the training process, instead of on any user-defined parameter.

Supporting Note 4: DBC and ADEPT
The spike segment outputs from PETR are processed using two published algorithms, Second-Order-Differential-Based Calibration (DBC) [8] and ADEPT [9]. The purpose these two algorithms share is to extract the features of each spike, i.e., its duration and amplitude. The procedure of DBC is illustrated as follows using an example (Fig. S15).
1. The spike segment is fitted by an 8th-order Fourier series to smooth out the noise, i.e., the orange curve in Fig. S15(a).
2. The second-order derivative of the smoothed spike segment is calculated, i.e., the pink curve in Fig. S15(b).
3. The valleys of the second-order derivative are found, i.e., the green dots in Fig. S15(b), from which the smallest and second-smallest are registered as the start and end time points of the spike, respectively, i.e., those marked by the blue squares in Fig. S15(b). The duration is then the time interval between the start and end points.
4. The amplitude is calculated by averaging the area of the spike waveform below the baseline during the translocation over the entire duration.
The procedure of ADEPT is illustrated as follows using an example (Fig. S16).
1. The baseline of the spike segment is subtracted, i.e., Fig. S16(a).
2. The segment is fitted by a target function:

where H(t) is the Heaviside function, marking the start time, ts, and end time, te, of the translocation. The two exponential functions describe the decreasing and increasing relaxation of the current, caused by the delayed response of the system to the analyte entering and exiting during the translocation. τ1 and τ2 are the relaxation time constants for the current decrease and increase, respectively (Fig. S16(b)).
3. The duration can be calculated as the time difference between the start and end points, i.e., te − ts. The amplitude is the fitting parameter a.
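The target function itself is not reproduced in this text. The Python sketch below shows one functional form that is consistent with the definitions given here (two Heaviside-gated exponentials with time constants τ1 and τ2, amplitude a, start time ts and end time te, and a baseline already subtracted to zero). The exact expression, the sign convention (a positive a meaning a current decrease), and the function name are assumptions; the fitting step, which would need a nonlinear least-squares routine, is omitted.

```python
import math


def adept_like_model(t, a, t_s, t_e, tau1, tau2):
    """Assumed baseline-subtracted pulse shape with exponential relaxations.

    For t < t_s the value is 0. After t_s the current relaxes towards -a with
    time constant tau1. After t_e it relaxes back towards 0 with time constant tau2.
    """
    # Heaviside gating is written as conditionals to avoid evaluating
    # exponentials with large positive arguments before the step.
    fall = math.exp(-(t - t_s) / tau1) - 1.0 if t >= t_s else 0.0
    rise = 1.0 - math.exp(-(t - t_e) / tau2) if t >= t_e else 0.0
    return a * (fall + rise)


# Hypothetical parameter values, in arbitrary but consistent units.
params = dict(a=1.0, t_s=1.0, t_e=3.0, tau1=0.1, tau2=0.1)
before_pulse = adept_like_model(0.0, **params)    # before entry: baseline
inside_pulse = adept_like_model(2.9, **params)    # deep inside the pulse: close to -a
after_pulse = adept_like_model(10.0, **params)    # long after exit: back near baseline
duration = params["t_e"] - params["t_s"]          # te - ts, as in step 3
```

Under this assumed form, fitting the five parameters to a spike segment yields the duration as te − ts and the amplitude as a, in line with step 3.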
The amplitude and duration of the spike segments extracted by means of ADEPT and DBC are shown in Fig. S17. Three typical examples of the spike segments of λ-DNA and streptavidin are displayed in Fig. S17(a)-(b) with the start and end time points predicted by PETR marked by red and green stars, respectively. The spike amplitude increases with increasing bias voltage, which is reasonable since a higher voltage induces a larger ionic current through the nanopore. In general, DBC extracts a smaller spike amplitude than ADEPT. This relates to the flattening of the spike amplitude along the translocation time span, i.e., averaging the changing current over the translocation duration. The spike duration of λ-DNA and streptavidin in Fig. S17(c)-(d) shows similar trends with bias voltage as those extracted directly by PETR (Fig. S14(a)-(b)). Hence, the results from both methods, as well as the spike segments detected by PETR, are deemed reliable.
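The averaging that lowers the DBC amplitude can be seen in a minimal Python sketch of the DBC steps listed earlier in this note. Several simplifications are assumptions: the Fourier series uses a fixed fundamental period equal to the segment length (a plain projection instead of a fitted frequency), the baseline is taken as zero, the two deepest valleys are ordered by time to give start and end, and all function names are hypothetical.

```python
import math


def fourier_second_derivative(y, dt, order=8):
    """Second derivative of a truncated Fourier series projected onto y."""
    n = len(y)
    if n <= 2 * order:
        raise ValueError("segment too short for the requested Fourier order")
    period = n * dt
    d2 = [0.0] * n
    for k in range(1, order + 1):
        # Discrete projection coefficients for harmonic k.
        a_k = 2.0 / n * sum(y[i] * math.cos(2 * math.pi * k * i / n) for i in range(n))
        b_k = 2.0 / n * sum(y[i] * math.sin(2 * math.pi * k * i / n) for i in range(n))
        omega = 2 * math.pi * k / period
        for i in range(n):
            phase = 2 * math.pi * k * i / n
            # d^2/dt^2 of (a cos + b sin) is -omega^2 times the same term.
            d2[i] -= omega ** 2 * (a_k * math.cos(phase) + b_k * math.sin(phase))
    return d2


def dbc_like_features(y, dt, baseline=0.0, order=8):
    """Return (start_time, end_time, duration, amplitude) of a downward spike."""
    d2 = fourier_second_derivative(y, dt, order)
    valleys = [i for i in range(1, len(d2) - 1)
               if d2[i] < d2[i - 1] and d2[i] <= d2[i + 1]]
    if len(valleys) < 2:
        raise ValueError("fewer than two valleys found")
    # Two deepest valleys, then ordered in time (assumption).
    start, end = sorted(sorted(valleys, key=lambda i: d2[i])[:2])
    # Area below the baseline divided by the duration, i.e., the mean depth.
    amplitude = sum(baseline - y[i] for i in range(start, end + 1)) / (end - start + 1)
    return start * dt, end * dt, (end - start) * dt, amplitude


# Hypothetical noise-free segment: unit-depth rectangular spike from 0.3 to 0.7.
dt = 0.005
segment = [-1.0 if 60 <= i < 140 else 0.0 for i in range(200)]
start_time, end_time, duration, amplitude = dbc_like_features(segment, dt)
```

Because the smoothed edges place the start and end slightly outside the rectangular spike in this example, the averaged amplitude comes out below the true depth of 1, which mirrors the smaller amplitudes reported for DBC.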

Fig. S17. Results from DBC and ADEPT as two demonstrations of further processing the output spike segments from PETR. Spike amplitude average with its spread of (a) DNA and (b) streptavidin translocation data. Spike duration average with its spread of (c) DNA and (d) streptavidin translocation data.